Molecular insights into the catalytic promiscuity of a bacterial diterpene synthase

Diterpene synthase VenA is responsible for assembling venezuelaene A with a unique 5-5-6-7 tetracyclic skeleton from geranylgeranyl pyrophosphate. VenA also demonstrates substrate promiscuity by accepting geranyl pyrophosphate and farnesyl pyrophosphate as alternative substrates. Herein, we report the crystal structures of VenA in both apo form and holo form in complex with a trinuclear magnesium cluster and pyrophosphate group. Functional and structural investigations on the atypical 115DSFVSD120 motif of VenA, versus the canonical Asp-rich motif of DDXX(X)D/E, reveal that the absent second Asp of canonical motif is functionally replaced by Ser116 and Gln83, together with bioinformatics analysis identifying a hidden subclass of type I microbial terpene synthases. Further structural analysis, multiscale computational simulations, and structure-directed mutagenesis provide significant mechanistic insights into the substrate selectivity and catalytic promiscuity of VenA. Finally, VenA is semi-rationally engineered into a sesterterpene synthase to recognize the larger substrate geranylfarnesyl pyrophosphate.

For the site-directed mutagenesis of VenA, the linearized expression vectors for VenA mutants were PCR amplified from pET28b-venA WT or pET28b-venA mutant using the specific primer pairs listed in Supplementary Table 15. These linear fragments were then self-ligated to generate recombinant vectors for the expression of mutated venA genes. To construct the VenA mutant and VenD co-expression vectors for terpenoid production in Eco-P (Supplementary Table 4), the DNA fragment of venD, including the coding sequence and the T7 promoter and terminator regions, was PCR amplified from pET28b-venD using the primer pair T7VenD-F/T7VenD-R (Supplementary Table 15). Subsequently, this fragment was inserted into the linearized pET28b-venA mutant to generate the co-expression vector (pET28b-venA mutant D). All resulting plasmids were verified by DNA sequencing before use.
Protein expression and purification. A single transformant of BL21(DE3) containing pET28b-venA WT, pET28b-venA mutant, pET-28a-sumo-venA WT or pET-28a-sumo-venA Δ1-15 was grown overnight at 37 °C in LB medium (tryptone 10 g/L, yeast extract 5 g/L and NaCl 10 g/L) containing 50 μg/mL kanamycin. The seed culture was then used to inoculate 0.5 L of TB medium (tryptone 12 g/L, yeast extract 24 g/L, K2HPO4 9.4 g/L, KH2PO4 2.2 g/L, and glycerol 40 g/L) containing 50 μg/mL kanamycin at a ratio of 1:100, and the culture was grown with shaking at 37 °C, 220 rpm. When the cell density reached an OD600 of 0.6, the culture temperature was lowered to 16 °C and expression was induced with 0.2 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 18 h. The cells were collected by centrifugation at 7,000 × g for 15 min at 4 °C and then resuspended in 100 mL of lysis buffer containing 25 mM HEPES, 150 mM NaCl and 10 mM imidazole (pH = 7.5). The suspension was passed through a French press (GuangZhou JuNeng Biology and Technology Co. Ltd, Guangzhou, China) to disrupt the cells, and the lysate was centrifuged at 35,000 × g for 1 h to remove cell debris. The supernatant was loaded onto a Ni-NTA column pre-equilibrated with 100 mL lysis buffer until the A280 reached baseline. The recombinant protein was eluted using a 10-500 mM imidazole gradient, and the fractions containing the target protein were identified by SDS-PAGE analysis and pooled. Finally, the purified proteins were flash-frozen in liquid nitrogen and stored at −80 °C for later use.
The recombinant proteins Sumo-VenA WT and Sumo-VenA Δ1-15 were then treated with SUMO protease overnight at 4 °C while being dialyzed against a buffer containing 25 mM HEPES and 150 mM NaCl (pH = 7.5) to remove imidazole. The resulting solution was loaded onto a Ni-NTA column again and the flow-through fractions were collected. The protein concentration was determined by NanoDrop (Thermo Scientific). Prior to crystallization trials, protein solutions were stored at −80 °C.
Structure determination of 4, 5, 7, 13, 14, 15, 16, 17 and 18*. The identities of β-farnesene (4), germacrene D (5), geranyllinalool (13), geranylgeraniol (14), α-farnesene (15), nerolidol (16), and farnesol (17) Table 4) was picked and grown in 100 mL LB medium containing 50 μg/mL of kanamycin, 25 μg/mL of chloramphenicol, and 100 μg/mL of ampicillin at 37 °C, 220 rpm for 12 h. The seed culture was then used to inoculate 10-50 L of TB medium containing 50 μg/mL of kanamycin, 25 μg/mL of chloramphenicol, and 100 μg/mL of ampicillin at a ratio of 1:100 at 37 °C, 220 rpm. When the OD600 reached 1.0-1.5, protein overexpression for terpene/terpenoid production was initiated by adding IPTG to a final concentration of 0.2 mM. After additional cultivation at 18 °C for 96 h, 500 μL of the fermentation broth of engineered E. coli was extracted with an equal volume of ethyl acetate, vortexed for 10 min, and centrifuged at 14,000 × g for 10 min. The supernatant was then used directly for GC/GC-MS analysis to monitor terpene/terpenoid production. The yield of 9 was calculated by comparing the peak areas of two independent fermentation samples with that of an authentic standard of known concentration during GC analysis. Finally, the fermentation broth of engineered E. coli was extracted three times with an equal volume of ethyl acetate, and the organic extracts were combined and concentrated by vacuum rotary evaporation. The crude extracts were further extracted four times with 50 mL n-hexane, and the hexane extracts were concentrated in vacuo and redissolved in acetonitrile. For the isolation of 8, the concentrated extract of Eco-A Y88A D was purified by semi-preparative HPLC on a YMC C-18 column (10 × 250 mm, 5 μm) with 100% acetonitrile over 30 min at a flow rate of 3 mL/min.
For the isolation of 9 and 10, the concentrated extracts of Eco-A W107A D and Eco-A V111W D, respectively, were purified by semi-preparative HPLC on a YMC C-18 column (10 × 250 mm, 5 μm) with 100% methanol over 30 min at a flow rate of 3 mL/min. For the isolation of 11 and 12, the concentrated extract of Eco-A F185A D was purified by semi-preparative HPLC according to their retention times of 32−33 min (11) and 31−32 min (12) on a YMC C-18 column (10 × 250 mm, 5 μm) with 100% acetonitrile using the following flow-rate program: 0−17 min, 1.5 mL/min; 17.5−23 min, 2.0 mL/min; 23.5−29 min, 2.5 mL/min; and 29.5−35 min, 3 mL/min.
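The external-standard yield estimate described above for 9 (GC peak areas of two independent fermentation samples compared with an authentic standard of known concentration) can be sketched as follows. This is a minimal illustration only: it assumes a single-point calibration with a linear detector response passing through the origin, and the function names and units are hypothetical rather than taken from the study.

```python
from statistics import mean


def concentration_from_standard(sample_area, standard_area, standard_conc):
    """Single-point external-standard estimate (assumes linear response through zero)."""
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    # Concentration scales with the ratio of sample to standard peak area.
    return sample_area / standard_area * standard_conc


def mean_yield(sample_areas, standard_area, standard_conc):
    """Average the estimates from independent fermentation samples."""
    return mean(
        concentration_from_standard(area, standard_area, standard_conc)
        for area in sample_areas
    )
```

The result carries the units of `standard_conc` (for example mg/L). A multi-point calibration curve would be the more robust choice if the response is not strictly linear over the measured range.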
For in vitro preparation of 19, a 600 mL reaction mixture containing 10 μM VenA Y88A/W107A/V111A/T112A/F215A/F219A, 10 mM MgCl2, and 100 μM GFPP in Tris-HCl buffer (50 mM, 10% glycerol, pH = 7.4) was incubated at 30 °C for 2 h. The enzyme reaction mixture was extracted three times with 600 mL n-hexane, and the extracts were combined, concentrated in vacuo and redissolved in acetonitrile. For the isolation of 19, the concentrated extract was purified by semi-preparative HPLC on a YMC C-18 column (10 × 250 mm, 5 μm) with 100% methanol over 35 min at a flow rate of 3 mL/min.
ECD analysis. The ECD spectra were recorded on a JASCO J-1500 spectropolarimeter (JASCO Corporation, Tokyo, Japan) using the SpectraManager software. ECD spectra were generated using the program SpecDis21 by applying a Gaussian band shape with 0.3 eV width for 8 and 0.22 eV width for 10 from dipole-length rotational strengths (Supplementary Figs. 35 and 73). The spectra of the conformers were combined using Boltzmann weighting, with the lowest-energy conformations accounting for about 99% of the weights. The calculated spectra were shifted by 3 nm for 8 and 13 nm for 10 to facilitate comparison with the experimental data.
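The three post-processing steps named above (Gaussian band broadening, Boltzmann weighting of conformers, and a wavelength shift) can be illustrated with a short sketch. It is not the SpecDis implementation: the band shape below is an unnormalized Gaussian in arbitrary units, the temperature of 298.15 K and the use of relative energies in kcal/mol are assumptions, and all function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
HC_EV_NM = 1239.84198  # h*c in eV*nm, converts photon energy to wavelength


def boltzmann_weights(rel_energies_kcal, temperature_k=298.15):
    """Normalized Boltzmann populations from conformer relative energies."""
    e_min = min(rel_energies_kcal)
    factors = [
        math.exp(-(e - e_min) / (R_KCAL * temperature_k))
        for e in rel_energies_kcal
    ]
    total = sum(factors)
    return [f / total for f in factors]


def broadened_ecd(energy_ev, transitions, width_ev):
    """Sum of Gaussian bands (arbitrary units).

    transitions: list of (excitation_energy_eV, rotational_strength) pairs.
    """
    return sum(
        r * math.exp(-(((energy_ev - e0) / width_ev) ** 2))
        for e0, r in transitions
    )


def averaged_spectrum(grid_ev, conformers, width_ev, temperature_k=298.15):
    """Boltzmann-weighted spectrum.

    conformers: list of (relative_energy_kcal, transitions) pairs.
    """
    weights = boltzmann_weights([c[0] for c in conformers], temperature_k)
    return [
        sum(w * broadened_ecd(e, c[1], width_ev)
            for w, c in zip(weights, conformers))
        for e in grid_ev
    ]


def ev_to_nm(energy_ev, shift_nm=0.0):
    """Convert an energy in eV to a wavelength in nm, with an optional shift."""
    return HC_EV_NM / energy_ev + shift_nm
```

Under these assumptions, a conformer lying about 3 kcal/mol above the global minimum contributes less than 1% of the population at room temperature, which is consistent with a single low-energy conformer dominating the weights.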
Structure elucidation of 8, 9, 10, 11, 12 and 19. The purified 8 was a colorless oil with [α]20D = +7.7 (c 0.117, CH2Cl2) and UVmax = 210 nm; its structure was determined to be the known diterpene (S)-(+)-cembrene A by ECD analysis and by comparing the NMR and optical rotation data with previously published data 6 (Supplementary Table 5 Table 6, Supplementary Figs. 36-44). The planar structure of 9 was determined from 1H-1H and 1H-13C correlations. Briefly, the five-membered ring was deduced from the 1H-1H correlations of H11-H12-H13-H14 and the 1H-13C correlations between C1 and H11, H14. The eleven-membered ring was built based on the 1H-1H correlations of H2-H3, H5-H6-H7 and H9-H10-H11, and the 1H-13C correlations from H15 to C1, C2 and C11, from H17 to C7, C8 and C9, and from H16 to C3, C4 and C5. The absolute configuration of 9 was assigned as 1S*, 11R*, 12S* based on the X-ray single crystal analysis (Supplementary Fig. 37).
H4−H5−H6−H2 as well as the 1H-13C correlations from H16 to C3, C14 and C15, from H2 to C1, C3, C6 and C20, and from H20 to C3, C4 and C15 revealed the two merged five-membered rings. Finally, the absolute configuration of 11 was assigned as 1R*, 2R*, 3R*, 6R*, 7R*, 10S*, 11R*, 14S* based on its crystal structure (Supplementary Fig. 49).
unassigned products were calculated from their peak areas relative to the peak areas of the homologues with known concentrations during GC−MS analysis. The conversion ratio of an enzymatic assay was calculated by comparing each product with its authentic standard or homologues during GC analysis. The conversion ratio of wild-type VenA toward GGPP was set to 100%. Data from triplicate experiments (mean ± SD) were used to compare the relative catalytic efficiencies of wild-type VenA and its mutants toward GGPP, FPP and GPP.
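The normalization described above (wild-type conversion toward GGPP set to 100%, triplicates reported as mean ± SD) can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def relative_conversion(replicates, wild_type_replicates):
    """Return (mean, SD) of conversion as a percentage of the wild-type mean.

    replicates: raw conversion values for one enzyme/substrate pair.
    wild_type_replicates: raw conversion values for wild type toward GGPP.
    """
    wt_mean = mean(wild_type_replicates)
    if wt_mean <= 0:
        raise ValueError("wild-type mean conversion must be positive")
    percents = [100.0 * value / wt_mean for value in replicates]
    # Sample standard deviation (n - 1); an assumption, see lead-in.
    return mean(percents), stdev(percents)
```

Passing the wild-type replicates as both arguments returns a mean of 100% by construction, with the SD reflecting the spread of the wild-type triplicate.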

Binding energy calculation.
To calculate the relative binding energies, we extracted pairs of important residues and intermediates from the QM/MM scan trajectories. The residues were truncated at the α-carbon and saturated with hydrogen atoms.
The relative binding energies were computed using Gaussian 16 with the same method and basis set as used for the QM region of the QM/MM calculation. To avoid basis set superposition errors, we employed the counterpoise correction method developed by Boys and Bernardi 12 .
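For reference, the standard form of the Boys and Bernardi counterpoise scheme for a residue (A) and an intermediate (B) at a fixed complex geometry can be written as below. The notation is introduced here for illustration and is not taken from the original text: the superscript denotes the basis set in which the energy is evaluated, and the argument in parentheses denotes the species.

```latex
% Counterpoise-corrected binding energy: every term is evaluated in the
% full dimer basis (AB), with ghost functions on the absent fragment.
E_{\mathrm{bind}}^{\mathrm{CP}}
  = E^{AB}(AB) - E^{AB}(A) - E^{AB}(B)

% Basis set superposition error removed by the correction:
\delta_{\mathrm{BSSE}}
  = \left[ E^{A}(A) - E^{AB}(A) \right]
  + \left[ E^{B}(B) - E^{AB}(B) \right]

% Relation to the uncorrected binding energy:
E_{\mathrm{bind}}^{\mathrm{CP}}
  = \left[ E^{AB}(AB) - E^{A}(A) - E^{B}(B) \right] + \delta_{\mathrm{BSSE}}
```

Because each fragment energy is variationally lowered by the extra ghost functions, the correction term is non-negative, so the corrected binding energy is less negative than the uncorrected one.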

Supplementary Tables
Supplementary Table 1. X-ray diffraction data and structural refinement statistics of VenA crystal structures.

Data collection
Space group P32 P32

Supplementary Table 5. NMR data of (S)-(+)-cembrene A (8).

Note: "*" stands for the Gram-negative plant-pathogenic bacteria.

Supplementary Table 15. The primers used in this study.

The residues Gln83 and Phe215 in the G1/2 kink of VenA are marked by "•" and "▲", respectively.

Supplementary Figure 11. The orientations of p-orbitals between C1 and C10 of the intermediate A (Fig. 5).

(v) The control reaction of (iv).